Machine Learning Methods with Noisy, Incomplete or Small Datasets

نویسندگان

چکیده

In this article, we present a collection of fifteen novel contributions on machine learning methods with low-quality or imperfect datasets, which were accepted for publication in the special issue “Machine Learning Methods Noisy, Incomplete Small Datasets”, Applied Sciences (ISSN 2076-3417). These papers provide variety approaches to real-world problems where available datasets suffer from imperfections such as missing values, noise artefacts. Contributions applied sciences include medical applications, epidemic management tools, methodological work, and industrial among others. We believe that will bring new ideas solving challenging problem, clear examples application scenarios.

برای دانلود باید عضویت طلایی داشته باشید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Machine Learning with Large Datasets

This paper introduces new algorithms and data structures for quick counting for machine learning datasets. We focus on the counting task of constructing contingency tables, but our approach is also applicable to counting the number of records in a dataset that match conjunctive queries. Subject to certain assumptions, the costs of these operations can be shown to be independent of the number of...

متن کامل

Authoritative Citation KNN Learning with Noisy Training Datasets

In this paper, we investigate the effectiveness of Citation K-Nearest Neighbors (KNN) learning with noisy training datasets. We devise an authority measure associated with each training instance that changes based on the outcome of Citation KNN classification. The authority is increased when a citer’s classification had been right; and vice versa. We show that by modifying only these authority ...

متن کامل

Performance of Various Machine Learning Classifiers on Small Datasets with Varying Dimensionalities: A Study

Classification is an important supervised learning technique that is used by many applications. An important factor on which the performance of a classifier depends is the size of the dataset using which the classifier is going to be trained. In this manuscript the authors have analyzed five different classification techniques (namely decision trees, KNN, SVM, linear discriminant and Ensemble m...

متن کامل

Effects of information and machine learning algorithms on word sense disambiguation with small datasets

Current approaches to word sense disambiguation use (and often combine) various machine learning techniques. Most refer to characteristics of the ambiguity and its surrounding words and are based on thousands of examples. Unfortunately, developing large training sets is burdensome, and in response to this challenge, we investigate the use of symbolic knowledge for small datasets. A naïve Bayes ...

متن کامل

Cached Sufficient Statistics for Efficient Machine Learning with Large Datasets

This paper introduces new algorithms and data st.ruct,ures for quick rounting for machine learning dat.asets. We focus on t,he counting task of constructing contingent:. t.ables, but our approach is also applicahlc t.o counting the number of records in a dataset that match conjunctive queries. Subject to certain assumptionsl t h c rosts of thesr operations ca,n he shown to be independent of the...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Applied sciences

سال: 2021

ISSN: ['2076-3417']

DOI: https://doi.org/10.3390/app11094132